Search | BVS Economia da Saúde

Electronic Health Records for Population Health Management: Comparison of Electronic Health Record-Derived Hypertension Prevalence Measures Against Established Survey Data.

Allen, Katie S; Valvi, Nimish; Gibson, P Joseph; McFarlane, Timothy; Dixon, Brian E.

Online J Public Health Inform ; 16: e48300, 2024 Mar 13.

Article in English | MEDLINE | ID: mdl-38478904

ABSTRACT

BACKGROUND: Hypertension is the most prevalent risk factor for mortality globally. Uncontrolled hypertension is associated with excess morbidity and mortality, and nearly one-half of individuals with hypertension do not have the condition under control. Data from electronic health record (EHR) systems may be useful for community hypertension surveillance, filling a gap in local public health departments' community health assessments and supporting the public health data modernization initiatives currently underway. To identify patients with hypertension, computable phenotypes are required. These phenotypes leverage available data elements, such as vitals measurements and medications, to identify patients diagnosed with hypertension. However, there are multiple methodologies for creating a phenotype, and identifying which method most accurately reflects real-world prevalence rates is needed to support data modernization initiatives. OBJECTIVE: This study sought to assess the comparability of 6 different EHR-based hypertension prevalence estimates with estimates from a national survey. Each of the prevalence estimates was created using a different computable phenotype. The overarching goal is to identify which phenotypes most closely align with nationally accepted estimations. METHODS: Using the 6 different EHR-based computable phenotypes, we calculated hypertension prevalence estimates for Marion County, Indiana, for the period from 2014 to 2015. We extracted hypertension rates from the Behavioral Risk Factor Surveillance System (BRFSS) for the same period. We used the two one-sided t tests (TOST) procedure to test equivalence between BRFSS- and EHR-based prevalence estimates. The TOST was performed at the overall level as well as stratified by age, gender, and race. RESULTS: Using both 80% and 90% CIs, the TOST analysis resulted in 2 computable phenotypes demonstrating rough equivalence to BRFSS estimates. Variation in performance was noted across phenotypes as well as demographics. TOST with 80% CIs demonstrated that the phenotypes had less variance compared to BRFSS estimates within subpopulations, particularly those related to racial categories. Overall, less variance occurred on phenotypes that included vitals measurements. CONCLUSIONS: This study demonstrates that certain EHR-derived prevalence estimates may serve as rough substitutes for population-based survey estimates. These outcomes demonstrate the importance of critically assessing which data elements to include in EHR-based computable phenotypes. Using comprehensive data sources, containing complete clinical data as well as data representative of the population, is crucial to producing robust estimates of chronic disease. As public health departments look toward data modernization activities, the EHR may serve to assist in more timely, locally representative estimates for chronic disease prevalence.
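The equivalence testing described in the abstract above can be illustrated with a minimal sketch. It is not the authors' analysis: it assumes a normal approximation for the difference between two independent proportions, and the function name `tost_two_proportions`, the equivalence margin, and every number in the usage note are hypothetical. With `alpha=0.05` the decision matches checking whether a 90% CI for the difference lies inside the margin, and with `alpha=0.10` an 80% CI.

```python
from math import sqrt
from statistics import NormalDist


def tost_two_proportions(p1, n1, p2, n2, margin, alpha=0.05):
    """Two one-sided z tests for equivalence of two proportions.

    Assumption: independent samples and a normal approximation.
    `margin` is the largest absolute difference still treated as
    equivalent; it is chosen by the analyst (hypothetical here).
    """
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    norm = NormalDist()

    # Test 1: H0 diff <= -margin, rejected when diff is well above -margin.
    p_lower = 1 - norm.cdf((diff + margin) / se)
    # Test 2: H0 diff >= +margin, rejected when diff is well below +margin.
    p_upper = norm.cdf((diff - margin) / se)

    # Equivalence is claimed only if both one-sided tests reject.
    p_value = max(p_lower, p_upper)

    # Matching (1 - 2 * alpha) confidence interval for the difference.
    z_crit = norm.inv_cdf(1 - alpha)
    ci = (diff - z_crit * se, diff + z_crit * se)

    return {"diff": diff, "p_value": p_value, "ci": ci,
            "equivalent": p_value < alpha}
```

For example, `tost_two_proportions(0.30, 5000, 0.31, 5000, margin=0.05)` compares two made-up prevalence estimates with a made-up margin of 5 percentage points; the `equivalent` flag reports whether both one-sided tests reject at the chosen `alpha`.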

Generalizability and portability of natural language processing system to extract individual social risk factors.

Magoc, Tanja; Allen, Katie S; McDonnell, Cara; Russo, Jean-Paul; Cummins, Jonathan; Vest, Joshua R; Harle, Christopher A.

Int J Med Inform ; 177: 105115, 2023 09.

Article in English | MEDLINE | ID: mdl-37302362

ABSTRACT

OBJECTIVE: The objective of this study is to validate and report on the portability and generalizability of a Natural Language Processing (NLP) method, originally developed at a different institution, to extract individual social factors from clinical notes. MATERIALS AND METHODS: A rule-based deterministic state machine NLP model was developed to extract financial insecurity and housing instability using notes from one institution and was applied to all notes written during 6 months at another institution. Ten percent of the notes classified as positive by the NLP model and the same number of notes classified as negative were manually annotated. The NLP model was adjusted to accommodate notes at the new site. Accuracy, positive predictive value, sensitivity, and specificity were calculated. RESULTS: More than 6 million notes were processed at the receiving site by the NLP model, which resulted in about 13,000 and 19,000 classified as positive for financial insecurity and housing instability, respectively. The NLP model showed excellent performance on the validation dataset with all measures over 0.87 for both social factors. DISCUSSION: Our study illustrated the need to accommodate institution-specific note-writing templates as well as clinical terminology of emergent diseases when applying an NLP model for social factors. A state machine is relatively simple to port effectively across institutions. Our study showed superior performance to similar generalizability studies for extracting social factors. CONCLUSION: A rule-based NLP model to extract social factors from clinical notes showed strong portability and generalizability across organizationally and geographically distinct institutions. With only relatively simple modifications, we obtained promising performance from an NLP-based model.
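The four measures named in the abstract above follow directly from the counts produced by manual annotation. The sketch below shows the standard definitions; the function name `classification_metrics` and the counts in the usage note are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts.

    tp/fp: notes the model flagged that annotators confirmed / rejected.
    tn/fn: notes the model did not flag that annotators agreed with /
           found to contain the social factor.
    """
    def ratio(numerator, denominator):
        # Return None instead of raising when a denominator is zero.
        return numerator / denominator if denominator else None

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "ppv": ratio(tp, tp + fp),            # positive predictive value
        "sensitivity": ratio(tp, tp + fn),    # true positive rate
        "specificity": ratio(tn, tn + fp),    # true negative rate
    }
```

With made-up counts, `classification_metrics(tp=90, fp=10, tn=95, fn=5)` gives a PPV of 0.9 and an accuracy of 0.925.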

Subjects

Electronic Health Records, Natural Language Processing, Humans, Algorithms, Health Facilities

Natural language processing-driven state machines to extract social factors from unstructured clinical documentation.

Allen, Katie S; Hood, Dan R; Cummins, Jonathan; Kasturi, Suranga; Mendonca, Eneida A; Vest, Joshua R.

JAMIA Open ; 6(2): ooad024, 2023 Jul.

Article in English | MEDLINE | ID: mdl-37081945

ABSTRACT

Objective: This study sought to create natural language processing algorithms to extract the presence of social factors from clinical text in 3 areas: (1) housing, (2) financial, and (3) unemployment. For generalizability, finalized models were validated on data from a separate health system. Materials and Methods: Notes from 2 healthcare systems, representing a variety of note types, were utilized. To train models, the study utilized n-grams to identify keywords and implemented natural language processing (NLP) state machines across all note types. Manual review was conducted to determine performance. Sampling used a set percentage of notes, determined by the prevalence of the social need. Models were optimized over multiple training and evaluation cycles. Performance was measured using positive predictive value (PPV), negative predictive value, sensitivity, and specificity. Results: PPV for housing rose from 0.71 to 0.95 over 3 training runs. PPV for financial rose from 0.83 to 0.89 over 2 training iterations, while PPV for unemployment rose from 0.78 to 0.88 over 3 iterations. The test data resulted in PPVs of 0.94, 0.97, and 0.95 for housing, financial, and unemployment, respectively. Final specificity scores were 0.95, 0.97, and 0.95 for housing, financial, and unemployment, respectively. Discussion: We developed 3 rule-based NLP algorithms, trained across health systems. While this is a less sophisticated approach, the algorithms demonstrated a high degree of generalizability, maintaining >0.85 across all predictive performance metrics. Conclusion: The rule-based NLP algorithms demonstrated consistent performance in identifying 3 social factors within clinical text. These methods may be a part of a strategy to measure social factors within an institution.
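To make the idea of a keyword-driven state machine concrete, here is a toy sketch. It is not the authors' model: the keyword list, the negation cues, the 3-token negation window, and the function name `flags_housing_instability` are all assumptions made for illustration, and a production rule set would be far larger and tuned through the review cycles the abstract describes.

```python
import re

# Hypothetical vocabularies; the study derived its keywords from n-grams.
KEYWORDS = {"homeless", "eviction", "evicted"}
NEGATIONS = {"no", "not", "denies", "denied", "without"}
NEGATION_WINDOW = 3  # assumed number of tokens a negation cue stays active


def flags_housing_instability(note):
    """Return True if the note has a non-negated housing keyword.

    Two states: SCANNING (default) and NEGATED (a negation cue was
    seen within the last NEGATION_WINDOW non-keyword tokens).
    """
    state, window = "SCANNING", 0
    for token in re.findall(r"[a-z']+", note.lower()):
        if token in NEGATIONS:
            # A negation cue opens (or restarts) the negated window.
            state, window = "NEGATED", NEGATION_WINDOW
        elif token in KEYWORDS:
            if state == "SCANNING":
                return True  # affirmed mention found
            # Negated mention: ignore it and resume scanning.
            state, window = "SCANNING", 0
        elif state == "NEGATED":
            # Ordinary token inside the window: count it down.
            window -= 1
            if window == 0:
                state = "SCANNING"
    return False
```

On invented sentences, `flags_housing_instability("Patient is currently homeless.")` returns `True`, while `flags_housing_instability("Patient denies being homeless.")` returns `False` because the keyword falls inside the negation window.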

Combining Nonclinical Determinants of Health and Clinical Data for Research and Evaluation: Rapid Review.

Golembiewski, Elizabeth; Allen, Katie S; Blackmon, Amber M; Hinrichs, Rachel J; Vest, Joshua R.

JMIR Public Health Surveill ; 5(4): e12846, 2019 Oct 07.

Article in English | MEDLINE | ID: mdl-31593550

ABSTRACT

BACKGROUND: Nonclinical determinants of health are of increasing importance to health care delivery and health policy. Concurrent with growing interest in better addressing patients' nonmedical issues is the exponential growth in availability of data sources that provide insight into these nonclinical determinants of health. OBJECTIVE: This review aimed to characterize the state of the existing literature on the use of nonclinical health indicators in conjunction with clinical data sources. METHODS: We conducted a rapid review of articles and relevant agency publications published in English. Eligible studies described the effect of, the methods for, or the need for combining nonclinical data with clinical data and were published in the United States between January 2010 and April 2018. Additional reports were obtained by manual searching. Records were screened for inclusion in 2 rounds by 4 trained reviewers with interrater reliability checks. From each article, we abstracted the measures, data sources, and level of measurement (individual or aggregate) for each nonclinical determinant of health reported. RESULTS: A total of 178 articles were included in the review. The articles collectively reported on 744 different nonclinical determinants of health measures. Measures related to socioeconomic status and material conditions were most prevalent (included in 90% of articles), followed by the closely related domain of social circumstances (included in 25% of articles), reflecting the widespread availability and use of standard demographic measures such as household income, marital status, education, race, and ethnicity in public health surveillance. Measures related to health-related behaviors (eg, smoking, diet, tobacco, and substance abuse), the built environment (eg, transportation, sidewalks, and buildings), natural environment (eg, air quality and pollution), and health services and conditions (eg, provider of care supply, utilization, and disease prevalence) were less common, whereas measures related to public policies were rare. When combining nonclinical and clinical data, a majority of studies associated aggregate, area-level nonclinical measures with individual-level clinical data by matching geographical location. CONCLUSIONS: A variety of nonclinical determinants of health measures have been widely but unevenly used in conjunction with clinical data to support population health research.
